AITopics | qualitative result

Aligning Text to Image in Diffusion Models is Easier Than You Think

Neural Information Processing Systems | Jun-23-2026, 00:38:24 GMT

While recent advancements in generative modeling have significantly improved text-image alignment, some residual misalignment between text and image representations remains. Some approaches address this issue by fine-tuning models with preference optimization and similar techniques, which require tailored datasets. Orthogonal to these methods, we revisit the challenge from the perspective of representation alignment, an approach that has gained popularity with the success of REPresentation Alignment (REPA) [46]. We first argue that conventional text-to-image (T2I) diffusion models, typically trained on paired image and text data (i.e., positive pairs) by minimizing score matching or flow matching losses, are suboptimal from the standpoint of representation alignment.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)


A.1 Qualitative Results of Bench

Neural Information Processing Systems | Jun-22-2026, 21:24:40 GMT

Figure 5: Word clouds of text prompts for the text-only generation (T2I) task (left) and the multimodal generation task (right).

Figure 5 visually summarizes the prominent semantic elements in the benchmark prompts for text-only (T2I) and multimodal generation tasks. The differentiation of the word clouds reflects task-specific features of MMGen-Bench, emphasizing spatial and descriptive details in T2I tasks, while multimodal tasks more frequently involve social and interactive scenarios.

Aspect       Objects  Relations  Attributes  Counting  Overall
Spearman ω   0.469    0.909      0.601       0.839     0.699

As depicted in Figure 6, the distribution of aspect types differs notably between the text-only generation (T2I) and multi-modal generation tasks. In the T2I setting, "Objects" dominate with 38.3%, while "Attributes" and "Relations" also constitute substantial proportions (33.9% and 25.4%, respectively).

artificial intelligence, interaction, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.47)


REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints

Neural Information Processing Systems | Jun-19-2026, 22:41:06 GMT

Articulated objects are prevalent in human life, and their 3D representations play crucial roles across various applications. However, achieving both high-fidelity textured surface reconstruction and dynamic generation for articulated objects remains challenging for existing methods. In this paper, we present REArtGS, a novel framework that introduces additional geometric and motion constraints to 3D Gaussian primitives, enabling realistic surface reconstruction and generation for articulated objects. Specifically, given multi-view RGB images of two arbitrary states of an articulated object, we first introduce an unbiased Signed Distance Field (SDF) guidance to regularize Gaussian opacity fields, enhancing geometry constraints and improving surface reconstruction quality. We then establish deformable fields for 3D Gaussians constrained by the kinematic structures of articulated objects, achieving unsupervised generation of surface meshes in unseen states. Extensive experiments on both synthetic and real datasets demonstrate that our approach achieves high-quality textured surface reconstruction for given states and enables high-fidelity surface generation for unseen states.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Asia > China (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)


VisDiff: SDF-Guided Polygon Generation for Visibility Reconstruction, Characterization and Recognition

Neural Information Processing Systems | Jun-17-2026, 17:06:54 GMT

The ability to capture rich representations of combinatorial structures has enabled the application of machine learning to tasks such as analysis and generation of floorplans, terrains, images, and animations. Recent work has primarily focused on understanding structures with well-defined features, neighborhoods, or underlying distance metrics, while those lacking such characteristics remain largely unstudied. Examples of these combinatorial structures can be found in polygons, where a small change in the vertex locations causes a significant rearrangement of the combinatorial structure, expressed as visibility or triangulation graphs. Current representation learning approaches fail to capture structures without well-defined features and distance metrics. In this paper, we study the open problem of Visibility Reconstruction: given a visibility graph G, construct a polygon P whose visibility graph is G.

machine learning, natural language, polygon, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.67)


OV-PARTS: Towards Open-Vocabulary Part Segmentation (Supplementary Material)

Neural Information Processing Systems | Apr-30-2026, 00:36:49 GMT

The supplementary material is organized as follows: Implementation Details. (Sec. Except for the Object Mask Prompt and Compositional Prompt Tuning designs, we follow the default architecture in the original ZSseg paper. The number of part queries is set to 50. All the two-stage baselines are trained with the AdamW optimizer with an initial learning rate of 1e-4 and weight decay of 1e-4. A poly learning rate policy with a power of 0.9 is adopted.
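The poly learning rate policy mentioned above is commonly defined as lr = base_lr * (1 - step / max_iters) ^ power. The following is a minimal Python sketch under that common definition, not the authors' training code. The initial rate of 1e-4 and the power of 0.9 come from the excerpt; the schedule length `MAX_ITERS` and the function name `poly_lr` are hypothetical and chosen only for illustration.

```python
def poly_lr(base_lr: float, step: int, max_iters: int, power: float = 0.9) -> float:
    """Polynomial ("poly") decay: base_lr * (1 - step / max_iters) ** power."""
    if max_iters <= 0:
        raise ValueError("max_iters must be positive")
    # Clamp the step so the rate stays at 0 once training runs past max_iters.
    progress = min(max(step, 0), max_iters) / max_iters
    return base_lr * (1.0 - progress) ** power


BASE_LR = 1e-4     # initial learning rate quoted in the text
POWER = 0.9        # poly power quoted in the text
MAX_ITERS = 1000   # hypothetical schedule length, for illustration only

# Learning rate at the start, midpoint, and end of the hypothetical schedule.
schedule = [poly_lr(BASE_LR, s, MAX_ITERS, POWER) for s in (0, 500, 1000)]
```

With a power below 1, the rate falls slightly slower than linearly at first and drops more steeply near the end of training. In PyTorch, the multiplicative factor `(1 - step / max_iters) ** power` could be supplied to `torch.optim.lr_scheduler.LambdaLR`.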

artificial intelligence, machine learning, pascal-part-116, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


d7a6f4830a18b6974326310478bfa489-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-29-2026, 22:52:27 GMT

artificial intelligence, machine learning, qualitative result, (14 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.96)


4b6538a44a1dfdc2b83477cd76dee98e-Supplemental.pdf

Neural Information Processing Systems | Apr-25-2026, 18:54:08 GMT

In this document, we provide more implementation details of CATs and more results on SPair-71k [16], PF-PASCAL [4], and PF-WILLOW [3]. Given resized input images $I_s, I_t \in \mathbb{R}^{256 \times 256 \times 3}$, we conducted experiments using different feature backbone networks, including DeiT-B [22], DINO [2], and ResNet-101 [5]. For the ResNet-101multi in the paper, we use the best layer subset [15] of (0,8,20,21,26,28,29,30) for SPair-71k, and (2,17,21,22,25,26,28) for PF-PASCAL and PF-WILLOW. We resized the spatial resolution of extracted feature maps to $16 \times 16$. The extracted features undergo L2 normalization, and the correlation maps are constructed using dot products.
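The last two steps described above (normalizing the features, then taking dot products) amount to computing cosine similarity between every source and target location. The following is a minimal NumPy sketch of that idea, not the CATs implementation. The function names, the random placeholder features, and the channel count of 768 are assumptions; only the 16 by 16 spatial resolution comes from the excerpt.

```python
import numpy as np


def l2_normalize(feats: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Scale each feature vector (last axis) to unit L2 norm."""
    norms = np.linalg.norm(feats, axis=-1, keepdims=True)
    return feats / np.maximum(norms, eps)


def correlation_map(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    """Take (H, W, C) feature maps and return an (H*W, H*W) similarity matrix.

    Entry [i, j] is the dot product between the normalized feature at source
    location i and the normalized feature at target location j.
    """
    channels = src.shape[-1]
    src_flat = l2_normalize(src.reshape(-1, channels))
    tgt_flat = l2_normalize(tgt.reshape(-1, channels))
    return src_flat @ tgt_flat.T


# Random stand-ins for backbone features; 768 channels is a placeholder.
rng = np.random.default_rng(0)
feat_src = rng.standard_normal((16, 16, 768))
feat_tgt = rng.standard_normal((16, 16, 768))
corr = correlation_map(feat_src, feat_tgt)  # shape (256, 256)
```

Because both inputs are unit-normalized, every entry lies in [-1, 1], which keeps correlation values comparable across backbones with different feature magnitudes.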

artificial intelligence, correlation map, machine learning, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.71)
Information Technology > Sensing and Signal Processing > Image Processing (0.49)


1abc87c67cc400a67b869358e627fe37-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 12:06:28 GMT

artificial intelligence, machine learning, mvtec-ad, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.72)


Supplementary Materials: An Empirical Study of Adder Neural Networks for Object Detection

Neural Information Processing Systems | Apr-25-2026, 11:37:54 GMT

As discussed in prior literature [1, 4], a single floating-point addition costs 0.9 pJ and a single floating-point multiplication costs 3.7 pJ. By comparison, a single 8-bit integer addition costs 0.03 pJ and a single 8-bit integer multiplication costs 0.2 pJ, far less than their floating-point counterparts. It is therefore important to explore whether adder detectors perform well under INT8 quantization. We applied INT8 post-training quantization to our Adder FCOS (B+N) model, which suffers a 0.8 mAP drop compared with the full-precision model, as shown in Table A. The energy reduction further increases from 29% to 35%. Note that post-training quantization is not optimal for INT8 models, and quantization-aware training may further improve the accuracy considerably.
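The per-operation figures quoted above can be turned into a quick back-of-the-envelope comparison. The following Python sketch only does arithmetic on the four energy costs from the text; the helper name `total_energy_pj` and the operation counts passed to it are hypothetical, and the result is not an estimate of any real detector's energy use.

```python
# Per-operation energy costs in picojoules, as quoted in the text.
ENERGY_PJ = {
    ("float", "add"): 0.9,
    ("float", "mul"): 3.7,
    ("int8", "add"): 0.03,
    ("int8", "mul"): 0.2,
}


def total_energy_pj(precision: str, n_add: int, n_mul: int) -> float:
    """Energy in pJ for a workload with the given addition and multiplication counts."""
    return (n_add * ENERGY_PJ[(precision, "add")]
            + n_mul * ENERGY_PJ[(precision, "mul")])


# How much cheaper each INT8 operation is than its floating-point counterpart.
add_ratio = ENERGY_PJ[("float", "add")] / ENERGY_PJ[("int8", "add")]  # about 30x
mul_ratio = ENERGY_PJ[("float", "mul")] / ENERGY_PJ[("int8", "mul")]  # about 18.5x

# Hypothetical workload: one million additions and one million multiplications.
N_OPS = 1_000_000
float_energy = total_energy_pj("float", N_OPS, N_OPS)
int8_energy = total_energy_pj("int8", N_OPS, N_OPS)
```

The ratios show that additions benefit more from INT8 than multiplications do, which is one reason quantization is of particular interest for addition-based networks.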

artificial intelligence, detector, machine learning, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.41)
